Add decrease interface for aggregation #9737

Open. xzhangxian1008 wants to merge 39 commits into master.

@xzhangxian1008 commented Dec 20, 2024
Contributor

What problem does this PR solve?

Issue Number: ref #7376

Problem Summary:

What is changed and how it works?

To support aggregation in window functions, we need to add some interfaces for aggregation. This PR adds the `decrease` interface for aggregation.
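To illustrate why a window operator benefits from a `decrease` method: when a frame slides forward by one row, the state can be updated by adding the entering row and removing the leaving row instead of recomputing the whole frame. The following is a minimal sketch under that assumption only. `SumState`, its `add`/`decrease` methods, and `slidingSum` are hypothetical names for illustration and are not the interface added in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical aggregate state for sum; not the actual TiFlash class.
struct SumState
{
    int64_t sum = 0;
    void add(int64_t v) { sum += v; }      // a row enters the frame
    void decrease(int64_t v) { sum -= v; } // a row leaves the frame
};

// Sum over a sliding frame covering the last `width` rows ending at each row.
std::vector<int64_t> slidingSum(const std::vector<int64_t> & rows, size_t width)
{
    assert(width > 0);
    std::vector<int64_t> result;
    result.reserve(rows.size());
    SumState state;
    for (size_t i = 0; i < rows.size(); ++i)
    {
        state.add(rows[i]);
        if (i >= width)
            state.decrease(rows[i - width]); // drop the row that fell out of the frame
        result.push_back(state.sum);
    }
    return result;
}
```

Each row is added once and removed at most once, so the work per row is constant regardless of the frame width.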

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

@ti-chi-bot ti-chi-bot bot added the release-note-none and size/XXL labels Dec 20, 2024
@xzhangxian1008
Contributor Author

/retest


void decrease(AggregateDataPtr __restrict, const IColumn **, const size_t, Arena *) const override
{
throw Exception("Not implemented yet");
Contributor

Add the method name and the aggregate function name to the error message?

Contributor Author

done
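A rough sketch of what such a message could look like, assuming a helper that takes the method name and the function name. `notImplementedMessage` and `ExampleAggregate` are hypothetical, and `std::runtime_error` stands in for the project's own `Exception` class, whose constructor is not shown in this thread.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: builds a message naming both the missing method
// and the aggregate function it was called on.
std::string notImplementedMessage(const std::string & method, const std::string & func_name)
{
    return "Method " + method + " is not implemented for aggregate function " + func_name;
}

// Hypothetical aggregate function used only to show the call site.
struct ExampleAggregate
{
    std::string getName() const { return "example"; }

    void decrease() const
    {
        // std::runtime_error is a stand-in for the project's Exception type.
        throw std::runtime_error(notImplementedMessage("decrease", getName()));
    }
};
```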

@@ -594,6 +600,10 @@ struct AggregateFunctionMinData : Data
}
bool changeIfBetter(const Self & to, Arena * arena) { return this->changeIfLess(to, arena); }

void prepareWindow() { throw Exception("Not implemented yet"); }
Contributor

It looks like SingleValueDataGeneric derives from CommonImpl, so prepareWindow is not needed here?

Contributor Author

done

@@ -602,10 +612,13 @@ struct AggregateFunctionMaxData : Data
{
using Self = AggregateFunctionMaxData<Data>;

void insertResultInto(IColumn & to) const { Data::insertResultInto(to); }
Contributor

ditto

Contributor Author

done

@@ -594,6 +600,10 @@ struct AggregateFunctionMinData : Data
}
bool changeIfBetter(const Self & to, Arena * arena) { return this->changeIfLess(to, arena); }

void prepareWindow() { throw Exception("Not implemented yet"); }

void insertResultInto(IColumn & to) const { Data::insertResultInto(to); }
Contributor

Can this be removed?

Contributor Author

done


bool changeIfLess(const Self & to, Arena * arena)
{
if (saved_values != nullptr)
Contributor

Is this check necessary? Do you mean prepareWindow() is not always called?
But reset() doesn't check whether saved_values is nullptr.

Contributor Author

Yes, in some circumstances the decrease function is never called.

@xzhangxian1008
Contributor Author

/cc @guo-shaoge @windtalker

@ti-chi-bot ti-chi-bot bot requested a review from guo-shaoge January 7, 2025 06:14
@xzhangxian1008
Contributor Author

/cc @guo-shaoge

int recursion_level) const
{
AggregateFunctionCombinatorPtr combinator
= AggregateFunctionCombinatorFactory::instance().tryFindSuffix("NullForWindow");
Contributor

Even if the argument is not null, does it still need this combinator?

Contributor Author

yes

@@ -75,7 +75,10 @@ class AggregateFunctionForEach final
size_t old_size = state.dynamic_array_size;
if (old_size < new_size)
{
state.array_of_aggregate_datas = arena.realloc(
if unlikely (arena == nullptr)
throw Exception("Get nullptr in ensureAggregateData");
Contributor

Suggested change
- throw Exception("Get nullptr in ensureAggregateData");
+ throw Exception("Get null arena ptr in ensureAggregateData");

Contributor Author

done

}
}

void prepareWindow(AggregateDataPtr __restrict place) const override
Contributor

For aggregate functions that the window operator does not support, I suggest not implementing these window-function-specific interfaces, since they will never be used and cannot be well tested.

Contributor Author

done

using Self = SingleValueDataFixedForWindow<T>;
using ColumnType = std::conditional_t<IsDecimal<T>, ColumnDecimal<T>, ColumnVector<T>>;

mutable std::deque<T> * saved_values;
Contributor

Why do we need to use a raw pointer here?

Contributor Author

We have dropped the pointer now.

void changeIfLess(const IColumn & column, size_t row_num, Arena * arena)
{
auto to_value = static_cast<const ColumnType &>(column).getData()[row_num];
if (saved_values != nullptr)
Contributor

Why can't we ensure that saved_values is never null? If the call falls back to SingleValueDataFixed<T>::changeIfLess(column, row_num, arena), it seems meaningless.

Contributor Author

The check for saved_values has been deleted.


if (result_is_nullable && need_counter)
{
auto tmp = prefix_size;
Contributor

If prefix_size >= sizeof(Int64), is there no need to enlarge the prefix?

Contributor Author

No. When prefix_size == sizeof(Int64), the null flag still needs 1 byte, so we need a larger prefix. When prefix_size > sizeof(Int64), we may still need to enlarge the prefix to preserve alignment.
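The alignment argument can be sketched with a small calculation. This assumes a layout of a 1-byte null flag, then an 8-byte counter, then the nested state, each placed at an offset that is a multiple of its alignment. The layout, `alignUp`, and `prefixSizeWithCounter` are illustrative assumptions; the actual layout in this PR may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round `size` up to the next multiple of `alignment`.
size_t alignUp(size_t size, size_t alignment)
{
    assert(alignment > 0);
    return (size + alignment - 1) / alignment * alignment;
}

// Assumed layout: [null flag: 1 byte][padding][counter: 8 bytes][padding][nested state].
// Returns the number of prefix bytes that precede the nested state.
size_t prefixSizeWithCounter(size_t nested_alignment)
{
    size_t offset = 1;                         // the null flag always takes 1 byte
    offset = alignUp(offset, sizeof(int64_t)); // pad so the counter is 8-byte aligned
    offset += sizeof(int64_t);                 // the counter itself
    return alignUp(offset, nested_alignment);  // pad so the nested state is aligned
}
```

Under these assumptions, a prefix of exactly 8 bytes cannot hold both the flag and the counter, and a nested state with a larger alignment pushes the prefix up to the next multiple of that alignment.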

if constexpr (is_add)
{
this->addCounter(place);
this->setFlag(place);
Contributor

Why do we still need the flag if we already have a counter?

Contributor Author

Because we inherit from AggregateFunctionNullBase, which needs this flag.
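One way a counter and a flag can coexist is sketched below, assuming the counter tracks how many non-null rows are currently in the frame and the flag is what the result path reads. `NullAwareSum` is a hypothetical stand-in and does not reproduce AggregateFunctionNullBase.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical null-aware sum state for a sliding frame.
struct NullAwareSum
{
    bool has_value = false; // flag read when producing the result
    int64_t counter = 0;    // number of non-null rows currently in the frame
    int64_t sum = 0;

    void add(std::optional<int64_t> v)
    {
        if (!v)
            return; // null rows do not touch the state
        ++counter;
        has_value = true;
        sum += *v;
    }

    void decrease(std::optional<int64_t> v)
    {
        if (!v)
            return;
        assert(counter > 0);
        --counter;
        sum -= *v;
        if (counter == 0)
            has_value = false; // the frame holds only nulls again
    }

    std::optional<int64_t> result() const
    {
        if (has_value)
            return sum;
        return std::nullopt;
    }
};
```

In this sketch the flag alone could not be cleared correctly on `decrease`, which is why the counter is kept alongside it.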

@xzhangxian1008
Contributor Author

/cc @windtalker @guo-shaoge

@ti-chi-bot ti-chi-bot bot requested a review from windtalker January 9, 2025 07:57
@xzhangxian1008
Contributor Author

/unhold

@ti-chi-bot ti-chi-bot bot removed the do-not-merge/hold label Jan 9, 2025
if constexpr (is_min)
iter = saved_values.begin();
else
iter = --(saved_values.end());
Contributor

Suggested change
- iter = --(saved_values.end());
+ auto iter = std::prev(saved_values.end());

Contributor Author

done
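For context, picking `begin()` for min and the last element for max only works when the container is kept in sorted order. The sketch below assumes an ordered container such as `std::multiset`; the type of `saved_values` at this point in the diff is not shown in the thread, and `WindowExtreme` is a hypothetical class for illustration.

```cpp
#include <cassert>
#include <iterator>
#include <set>

// Hypothetical min/max state for a sliding frame, backed by a sorted multiset.
template <bool is_min>
class WindowExtreme
{
public:
    void add(int v) { saved_values.insert(v); }

    void decrease(int v)
    {
        auto it = saved_values.find(v);
        assert(it != saved_values.end());
        saved_values.erase(it); // erase by iterator removes a single occurrence
    }

    int get() const
    {
        assert(!saved_values.empty());
        if constexpr (is_min)
            return *saved_values.begin();
        else
            return *std::prev(saved_values.end()); // last element without modifying a temporary
    }

private:
    std::multiset<int> saved_values;
};
```

Erasing by iterator rather than by value matters here: `erase(v)` on a multiset would remove every duplicate of `v`, not just the row leaving the frame.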

namespace DB
{
/** Aggregate functions that store one of passed values.
* For example: min, max, any, anyLast.
*/

struct CommonImpl
{
static void decrease(const IColumn &, size_t) { throw Exception(" decrease is not implemented yet"); }
Contributor

Why is there an extra space before the error message?

Contributor Author

done

@guo-shaoge
Contributor

other lgtm

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm label Jan 10, 2025
Contributor

ti-chi-bot bot commented Jan 10, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: guo-shaoge


Contributor

ti-chi-bot bot commented Jan 10, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-01-10 05:51:18 UTC: ☑️ agreed by guo-shaoge.

@ti-chi-bot ti-chi-bot bot added the approved label Jan 10, 2025
Contributor

ti-chi-bot bot commented Jan 10, 2025

@xzhangxian1008: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
pull-integration-test | c7817b5 | link | true | /test pull-integration-test


Labels
approved, needs-1-more-lgtm, release-note-none, size/XXL

3 participants